20 research outputs found

    A Clustering Based Classifier Ensemble Approach to Corporate Bankruptcy Prediction

    Corporate bankruptcy prediction is an important research direction in finance. A robust bankruptcy prediction scheme can benefit several stakeholders, including management organizations, government and stockholders. Ensemble learning is a well-known technique for improving the predictive performance of classification algorithms by decreasing the generalization error and enhancing the classification accuracy, and it is well established in bankruptcy prediction. Diversity plays an essential role in constructing robust ensemble classification schemes. In this paper, a clustering based classifier ensemble approach is presented for corporate bankruptcy prediction. In this scheme, the k-means algorithm is utilized to obtain diversified training subsets. Each base learning algorithm is trained on these subsets, and the predictions of the base learning algorithms are combined by a majority voting scheme. In the empirical analysis, four classification algorithms (namely, the C4.5 algorithm, the k-nearest neighbour algorithm, support vector machines and logistic regression) and three ensemble learning methods (Bagging, AdaBoost and Random Subspace) are evaluated.
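The scheme described in this abstract (k-means to diversify the training subsets, one base learner per subset, majority voting) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how subsets are built from the clusters, so the sketch assumes each subset is the training set minus one cluster. The nearest-centroid base learner, the toy data and all function names are hypothetical stand-ins.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain k-means, seeded deterministically with evenly spaced points."""
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_centroids(X, y):
    """Hypothetical base learner: one centroid per class."""
    model = {}
    for cls in set(y):
        rows = [x for x, t in zip(X, y) if t == cls]
        model[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def predict_one(model, x):
    return min(model, key=lambda cls: sq_dist(x, model[cls]))

def fit_ensemble(X, y, k=3):
    clusters = kmeans(X, k)
    models = []
    for held_out in range(k):
        # Assumption: each diversified subset omits exactly one cluster.
        idx = [i for i, c in enumerate(clusters) if c != held_out]
        if idx and len(idx) < len(X):
            models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict(models, x):
    """Combine the base learners by majority voting."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Toy data: two made-up financial ratios per firm; 1 = bankrupt, 0 = healthy.
X = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0), (8.0, 0.0), (8.0, 1.0)]
y = [0, 0, 0, 1, 1, 1]
models = fit_ensemble(X, y, k=3)
```

Calling `predict(models, (1.0, 0.5))` and `predict(models, (7.0, 0.5))` classifies a firm near the healthy group and one near the bankrupt group. In practice the base learners would be the classifiers named in the abstract rather than this stand-in.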

    The use of data mining in strategic management: a case study of association rule mining in a student information system

    In today's competitive conditions, changes in the business environment and in business structures make strategic management an effective form of management for businesses and organizations. Strategic management is a current management approach that requires setting appropriate strategies, plans and applications and putting them into action in order to reach the aims and goals of organizations. The process of strategic management involves setting the company's vision, mission and objectives, determining its competitive position, and evaluating the results obtained by strategy selection, development and application. In activities related to the strategic management of business processes, data mining, which can be defined as the process of extracting useful and meaningful patterns from large volumes of data, emerges as a viable method. In this study, the disciplines of strategic management and data mining are introduced together with their basic concepts and applications. Data mining methods are then considered in the context of strategic management. In addition, a sample case study on the use of association rule mining algorithms on student information system data is presented.
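To make the association rule mining mentioned in this abstract concrete, the sketch below computes support and confidence for simple one-to-one rules. It is a brute-force enumeration rather than Apriori or any algorithm named by the authors, and the student records and attribute names are invented for illustration.

```python
from itertools import permutations

# Hypothetical student records: each set lists attributes observed for one student.
records = [
    {"attends_lectures", "submits_homework", "passes"},
    {"attends_lectures", "submits_homework", "passes"},
    {"attends_lectures", "passes"},
    {"submits_homework"},
    {"attends_lectures", "submits_homework", "passes"},
]

def support(itemset):
    """Fraction of records that contain every item in the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def mine_rules(min_support=0.5, min_confidence=0.8):
    """Enumerate rules 'a -> b' that meet both thresholds."""
    items = sorted(set().union(*records))
    found = []
    for a, b in permutations(items, 2):
        s = support({a, b})
        if s < min_support:
            continue
        # Confidence: how often b holds among the records where a holds.
        confidence = s / support({a})
        if confidence >= min_confidence:
            found.append((a, b, s, confidence))
    return found

rules = mine_rules()
for a, b, s, c in rules:
    print(f"{a} -> {b}  support={s:.2f}  confidence={c:.2f}")
```

On this toy data only the rules linking `attends_lectures` and `passes` survive the thresholds. Rules involving `submits_homework` reach a confidence of about 0.75 and are filtered out.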

    Generalised Decision Level Ensemble Method for Classifying Multi-media Data

    In recent decades, multimedia data have been commonly generated and used in various domains, such as healthcare and social media, because of their ability to capture rich information. But as they are unstructured and separated, fusing and integrating multimedia datasets and then learning from them effectively has been a main challenge for machine learning. We present a novel generalised decision level ensemble method (GDLEM) that combines the multimedia datasets at decision level. After extracting features from each of the multimedia datasets separately, the method trains models independently on each media dataset and then employs a generalised selection function to choose the appropriate models to construct a heterogeneous ensemble. The selection function is defined as a weighted combination of two criteria: the accuracy of individual models and the diversity among the models. The framework is tested on multimedia data and compared with other heterogeneous ensembles. The results show that the GDLEM is more flexible and effective.
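A selection function of the kind described here, a weighted combination of accuracy and diversity, might look like the sketch below. The abstract does not give the exact formula, so everything here is an assumption: the greedy procedure, the use of pairwise disagreement as the diversity measure, the weight `alpha`, and the model names and predictions.

```python
def accuracy(pred, truth):
    """Fraction of validation samples a model labels correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def disagreement(a, b):
    """Fraction of samples on which two models give different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def select_models(preds, truth, n_select, alpha=0.5):
    """Greedily pick models by alpha * accuracy + (1 - alpha) * diversity.

    Diversity is assumed to be the mean disagreement with the models
    already chosen (0.0 while nothing has been chosen yet).
    """
    chosen = []
    remaining = sorted(preds)
    while remaining and len(chosen) < n_select:
        def score(name):
            acc = accuracy(preds[name], truth)
            if chosen:
                div = sum(disagreement(preds[name], preds[c]) for c in chosen) / len(chosen)
            else:
                div = 0.0
            return alpha * acc + (1 - alpha) * div
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical validation labels and per-media model predictions.
truth = [1, 0, 1, 1, 0, 1, 0, 0]
preds = {
    "image_svm":  [1, 0, 1, 1, 0, 1, 0, 1],
    "image_tree": [1, 0, 1, 1, 0, 1, 1, 1],
    "text_knn":   [0, 0, 1, 1, 0, 0, 0, 0],
}
selected = select_models(preds, truth, n_select=2, alpha=0.5)
```

With `alpha=0.5` the second pick is `text_knn` rather than `image_tree`. The two are equally accurate, but `text_knn` disagrees with the first pick more often, which is the behaviour a diversity term is meant to encourage.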

    Design and implementation of a method based on machine learning algorithms for opinion classification

    Opinion classification is a research field that uses methods, techniques and tools from natural language processing, machine learning and statistics to identify subjective information in text documents. In this thesis, opinion mining is treated as a text classification problem, and efficient opinion classification schemes based on machine learning methods are developed. The developed methods can be grouped into three categories. First, a feature selection method based on genetic rank aggregation, which effectively integrates individual filter-based feature selection methods, is presented to overcome the high dimensionality and sparsity problems encountered in text classification. Second, a classifier ensemble combination rule based on a multi-objective differential evolution algorithm is presented, so that the base learning algorithms of the ensemble contribute to its final outcome according to their predictive performance. Third, an ensemble pruning scheme is presented to obtain an appropriate subset of classifiers from the ensemble. The proposed pruning scheme utilizes consensus clustering and metaheuristic search. A novel classification architecture is proposed based on the developed methods. This architecture combines the developed methods effectively and achieves better predictive performance than the individual performances of the existing and the developed methods.
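The second contribution, a weighted voting combination rule, can be illustrated with the short sketch below. It shows only the voting step. The thesis obtains the weights with multi-objective differential evolution, which is not reproduced here, so the weights, labels and function name are assumed purely for illustration.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical outputs of three base classifiers for one review,
# with weights assumed to reflect their predictive performance.
predictions = ["negative", "positive", "positive"]
weights = [0.60, 0.25, 0.15]
label = weighted_vote(predictions, weights)
```

Here plain majority voting would return "positive", while the weighted rule returns "negative" because the single strongest classifier outweighs the other two.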

    A review of literature on the use of machine learning methods for opinion mining

    Opinion mining is an emerging field that uses methods from natural language processing, text mining and computational linguistics to extract subjective information about opinion holders, such as their attitudes, behaviour and emotions. Opinion mining can be viewed as a classification problem. Hence, machine learning based methods are widely employed for sentiment classification. Machine learning based methods in opinion mining fall into three main classes: supervised, semi-supervised and unsupervised methods. This study reviews the main existing literature on the use of machine learning methods for opinion mining and discusses the strengths and weaknesses of each machine learning method.

    USING THE PROMETHEE RANKING METHOD TO EVALUATE HOUSING PROJECTS

    The housing sector is one of the most important and fastest-growing sectors of the Turkish economy. This interest in the housing sector leads to many new housing projects being offered for sale. The evaluation of housing projects by decision makers can be treated as a multi-criteria decision-making problem. Many different methods have been developed to solve multi-criteria decision-making problems. In this study, the PROMETHEE method, one of the most effective and easily applicable multi-criteria decision-making methods, and the GAIA plane are used to evaluate various housing projects offered for sale in the Karşıyaka district of İzmir province with respect to features such as comfort, size, number of rooms and location within the city.
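As a rough illustration of how a PROMETHEE ranking works, the sketch below computes PROMETHEE II net outranking flows with the "usual" preference function, where an alternative is strictly preferred on a criterion whenever its score is higher. The study's actual preference functions, weights and data are not given in the abstract, so the three projects, their scores and the equal weights are invented, and the GAIA plane is not covered.

```python
def promethee_net_flows(scores, weights):
    """Net outranking flow of each alternative (PROMETHEE II, usual criterion).

    All criteria are assumed to be maximised and the weights to sum to 1.
    """
    names = list(scores)
    n = len(names)

    def pref(a, b):
        # Aggregated preference of a over b: total weight of criteria where a wins.
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    flows = {}
    for a in names:
        positive = sum(pref(a, b) for b in names if b != a) / (n - 1)
        negative = sum(pref(b, a) for b in names if b != a) / (n - 1)
        flows[a] = positive - negative
    return flows

# Hypothetical projects scored on comfort, size (m^2), number of rooms, location.
scores = {
    "A": [8, 120, 3, 7],
    "B": [6, 150, 4, 5],
    "C": [7, 100, 2, 9],
}
weights = [0.25, 0.25, 0.25, 0.25]

flows = promethee_net_flows(scores, weights)
ranking = sorted(flows, key=flows.get, reverse=True)
```

A higher net flow means an alternative outranks the others more than it is outranked. The net flows always sum to zero, which is a quick sanity check on any implementation.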

    Hierarchical graph-based text classification framework with contextual node embedding and BERT-based dynamic fusion

    We propose a novel hierarchical graph-based text classification framework that leverages contextual node embedding and BERT-based dynamic fusion to capture the complex relationships between the nodes in the hierarchical graph and classify text more accurately. The framework consists of seven stages: Linguistic Feature Extraction, Hierarchical Node Construction with Domain-Specific Knowledge, Contextual Node Embedding, Multi-Level Graph Learning, Dynamic Text Sequential Feature Interaction, Attention-Based Graph Learning, and Dynamic Fusion with BERT. The first stage, Linguistic Feature Extraction, extracts the linguistic features of the text, including part-of-speech tags, dependency parses, and named entities. The second stage constructs a hierarchical graph based on domain-specific knowledge, which is used to capture the relationships between nodes in the graph. The third stage, Contextual Node Embedding, generates a vector representation for each node in the hierarchical graph that captures its local context information, linguistic features, and domain-specific knowledge. The fourth stage, Multi-Level Graph Learning, uses a graph convolutional neural network to learn the hierarchical structure of the graph and extract the features of its nodes. The fifth stage, Dynamic Text Sequential Feature Interaction, captures the sequential information of the text and generates dynamic features for each node. The sixth stage, Attention-Based Graph Learning, uses an attention mechanism to capture the important features of the nodes in the graph. Finally, the seventh stage, Dynamic Fusion with BERT, combines the output of the previous stages with the output of a pre-trained BERT model to obtain the final integrated vector representation of the text. This approach leverages the strengths of both the proposed framework and BERT, allowing for better performance on the classification task. The proposed framework was evaluated on several benchmark datasets and compared to state-of-the-art methods, demonstrating significant improvements in classification accuracy.

    SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization

    Creating high-quality labeled data is crucial for training machine-learning models, but it can be time-consuming and labor-intensive. Moreover, human annotators differ in competency, training, and experience, which can result in inconsistent labeling and arbitrary standards. To address these challenges, researchers have been exploring automated methods for enhancing training and testing datasets. This paper proposes SRL-ACO, a novel text augmentation framework that leverages Semantic Role Labeling (SRL) and Ant Colony Optimization (ACO) techniques to generate additional training data for natural language processing (NLP) models. The framework uses SRL to identify the semantic roles of words in a sentence and ACO to generate new sentences that preserve these roles. SRL-ACO can enhance the accuracy of NLP models by generating additional data without requiring manual data annotation. The paper presents experimental results demonstrating the effectiveness of SRL-ACO on seven text classification datasets for sentiment analysis, toxic text detection and sarcasm identification. The results show that SRL-ACO improves the performance of a classifier on different NLP tasks and has the potential to enhance the quality and quantity of training data for various NLP tasks.